Abstract We determined the complete genomic sequence
of the human CD79b (Igβ/B29) gene. The CD79b gene
product is associated with the membrane immunoglobulin
signaling complex, which is composed of immunoglobulin
(Ig) itself, associated in a noncovalent fashion with CD79b
and a second polypeptide chain, CD79a (Igα/mb1). The
sequence and exon/intron organization of the human and
mouse CD79b genes are highly similar. The gene organization
suggests that some variant forms of CD79b may
arise by alternative splicing of mRNA. In addition,
a number of conserved regulatory sequences commonly
found in Ig genes are present in sequences which flank the
human CD79b gene. Some of these sequences are distinct
from those found in the CD79a promoter. These differences
may explain why transcription of CD79b, but not CD79a, is
observed in plasma cells. A new Taq I restriction fragment
length polymorphism is described that is not associated
with any structural polymorphisms of the expressed CD79b
polypeptide.



Introduction 

The recognition of specific antigen by B lymphocytes depends
on the function of the membrane immunoglobulin
(mIg) receptor complex, a multimeric complex which, in
addition to mIg, includes at least two associated glycoproteins
designated CD79a (Igα/mb1) and CD79b (Igβ/B29)
(Reth 1992; Schlossman et al. 1994). These immunoglobulin
(Ig)-associated molecules participate in signal
transduction and contain highly conserved intracytoplasmic
domains which appear to associate with intracellular signalling
molecules. The CD79b molecule is of particular
interest because several different forms of
CD79b may exist (Friedrich et al. 1993; Hashimoto and
co-workers, unpublished results), the molecular basis of
which remains undefined. In addition, unlike CD79a (Igα/mb1),
the transcription of CD79b is maintained in mIg-negative
plasma cells (Hashimoto et al. 1993). To begin to
understand these features of CD79b, we determined the
complete sequence of the CD79b gene, and in the course of
this work we also defined a new Taq I polymorphism.



The nucleotide sequence data reported in this paper have been sub- 
mitted to the GenBank nucleotide sequence database and have been 
assigned the accession number L27587 
Materials and methods 

Isolation of cosmid clones 

A chromosome 17-specific cosmid library was obtained from the 
Reference Library Data Base of the Imperial Cancer Research Fund 
(London, Great Britain). Filters #1 and #2 (Library number 105, set 12) 
were screened with a cDNA probe containing the full-length CD79b 
coding sequence, using standard conditions of high-stringency hy- 
bridization. Duplicate positive colonies were identified by position on 
the filter grid, and provided to us by the Reference Library. 



DNA sequencing strategy 

DNA sequencing was performed using a Taq DyeDeoxy™ Terminator
Cycle Sequencing Kit (ABI, Foster City, CA). Primers specific
for portions of the CD79b gene were synthesized (Biosearch 8700;
Millipore, Bedford, MA) as needed for sequence analysis by a primer
walking strategy. Twenty-two primers were used to complete the sequence
analysis of approximately 3800 base pairs (bp). Sequencing
reactions were analyzed on an Applied Biosystems (Foster City, CA)
Automated Sequenator Model 373A. The most 3' sequencing primer,
F13 (5' CGG ACA GGG GAA GTG AAG TG 3'), was used to confirm
the 3' extent of cosmid clones.

Fig. 1 The complete DNA sequence of the human CD79b
(Igβ/B29) gene. Potential regulatory sequences are underlined and
labeled above. The exon/intron borders are indicated by a horizontal
line, and the adjacent splice donor and acceptor sequences
are underlined. The polyadenylation signal is indicated in bold
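
As a rough sanity check on a sequencing primer such as F13, its length, GC content and an approximate melting temperature can be computed directly from the sequence given above. The sketch below is illustrative only: the function name is hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption made here, not a method stated in this paper.

```python
def primer_stats(primer: str) -> dict:
    """Summarize a primer written 5'->3'; whitespace between triplets is ignored."""
    seq = "".join(primer.split()).upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc
    return {
        "length": len(seq),
        "gc_fraction": gc / len(seq),
        # Wallace rule: an assumed rule of thumb for short oligonucleotides.
        "wallace_tm_c": 2 * at + 4 * gc,
    }


# Primer F13 as printed in the methods section.
F13 = "CGG ACA GGG GAA GTG AAG TG"
f13_stats = primer_stats(F13)
```

The Wallace estimate is only a coarse guide; cycle-sequencing annealing temperatures are normally set by the kit protocol rather than by this formula.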



Southern blot analysis 

Ten μg of genomic DNA were digested overnight with Taq I (Gibco/BRL,
Gaithersburg, MD), electrophoresed through a 1% agarose gel, and
transferred onto nylon membranes (Micron Separations, Westboro,
MA). After cross-linking using the Stratalinker UV cross-linker
(Stratagene, La Jolla, CA) for 90 s, the filters were prehybridized for
15 min at 60° C in QuikHyb solution (Stratagene). Filters then were
exposed for 1 h at 60° C in QuikHyb solution to a 32P-labeled, 800 bp
probe derived from the Igβ cDNA, and subsequently washed twice for
15 min at room temperature in 2 × standard sodium citrate (SSC), 0.1%
sodium dodecyl sulfate (SDS), followed by a 30 min wash at 65° C in
0.1 × SSC, 0.1% SDS. Filters were exposed overnight on Kodak XAR5
film at -70° C.
Fig. 2 Sequence comparison of the 5' UT regions of the human and 
mouse (Hermanson et al. 1988) CD79b genes. Conserved regulatory 
sequences are indicated 



Results 

Isolation of cosmid clones for CD79b 

The CD79b gene has been localized to chromosome 17q23 
(Wood et al. 1993). We therefore obtained a chromosome 
17-specific cosmid library for screening from the Reference 
Library Data Base of the Imperial Cancer Research Fund. 
The cosmid library was screened with a cDNA clone for 
human CD79b which was previously isolated in this labo- 
ratory (Hashimoto et al. 1993). Seven positive clones were 
identified on the initial screening. These clones were pro- 
vided to us by the ICRF laboratories and we subsequently 
identified two cosmids which appeared to contain the entire 
CD79b gene. These clones were identified by their ability 
to be amplified by PCR primers specific for the 5' end of 
the CD79b cDNA sequence (primer pair F24/R5), and by
successful priming of sequence reactions, using a primer
(F13) specific for the 3' end of the cDNA sequence.



Sequence analysis of the CD79b gene 

The entire genomic sequence of the CD79b gene was ob- 
tained and is shown in Figure 1. The overall intron/exon 
organization is highly similar to that reported for the mouse 
CD79b gene (Reth 1992). The CD79b genes of both human 
and mouse contain six exons. Exon 1 encodes a typical leader
peptide sequence; this is followed by exon 2, which encodes
a short peptide preceding the Ig-like domain encoded by
exon 3. Exon 4 encodes the hydrophobic transmembrane
(TM) portion of the CD79b molecule, whereas the coding
regions for the intracytoplasmic (CP) domains are divided
between exons 5 and 6. As noted previously (Hashimoto et
al. 1993), the mouse and human sequences are highly
similar, especially in the TM and CP regions
(>95% similarity in predicted amino acid sequence), as well
as sharing the conserved exon/intron structure shown
here. A 3' untranslated (UT) region of 461 bp separates the
stop codon from the polyadenylation signal AATAAA.

In addition to interspecies similarities in the coding region
and gene organization, there are also conserved
cis-acting regulatory elements which are present in the
flanking sequence of human CD79b. A comparison of the
human and mouse 5' promoter region is shown in Figure 2.
This region contains sequences which are known to be involved
in the regulation of Ig genes. Among these are the
highly conserved octamer sequence, as well as an ets
binding site and an Sp1 target sequence. A μB/LyF1
binding region was identified in the mouse and is partially
conserved in the human sequence. An additional enhancer
(E3-like) element was tentatively identified in the previously
reported mouse sequence, and is present in a
slightly altered form in the human CD79b promoter. An
E2-like sequence is also found in the human sequence, but
was not observed in the mouse (Hermanson et al. 1989).
Both the human and mouse promoter sequences lack a 5'
TATA box. Consistent with this, the mouse and human
CD79b transcripts have been shown to initiate at multiple
sites downstream of the octamer sequence (Hermanson et
al. 1989; Omori and Wall 1993).

Enhancer elements can also be found in the 3' end of Ig 
genes (Staudt and Lenardo 1991), and for this reason we 
searched the 3' region of human CD79b for such sequences. 
We identified four potential sites of E elements, all of 
which had 7/8 nucleotides identical to the previously de- 
scribed E elements (Staudt and Lenardo 1991). These are 
indicated in Figure 1. 
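
The 7/8 criterion used in this search amounts to scanning the sequence for 8-nucleotide windows that differ from a reference element at no more than one position. A minimal sketch of such a scan follows; the function name is hypothetical, and any motif passed to it is the caller's choice (the E element sequences themselves are not reproduced here and should be taken from Staudt and Lenardo 1991).

```python
def find_near_matches(sequence: str, motif: str, max_mismatches: int = 1):
    """Return (start, window, n_identical) for every window of len(motif)
    that differs from motif at no more than max_mismatches positions.

    start is a 0-based offset on the strand supplied; the reverse strand
    is not scanned in this sketch.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    k = len(motif)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start, window, k - mismatches))
    return hits
```

With an 8-nucleotide motif and `max_mismatches=1`, the hits are exactly the windows with 7/8 or 8/8 identical nucleotides. A scan of both strands would repeat the call on the reverse complement.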



Identification of RFLPs in the human CD79b gene 

We identified an allelic polymorphism associated with the
CD79b gene after digestion with Taq I. As shown in Figure 3,
two bands are observed in homozygous individuals.
One band at 2.2 kilobases (kb) is constant in all individuals.
A second band reflects a Taq I polymorphism which is
biallelic in the population, with either a 1.8 kb or 2.5 kb
band present. Heterozygotes (lane 5, Figure 3) contain both
of these bands, as well as the constant 2.2 kb band.

Fig. 3 Southern blot analysis of Taq I-digested genomic DNA from
five unrelated individuals. A full-length CD79b cDNA probe was used.
Two alleles have been observed in the population. Lane 1 shows an
individual homozygous for the 2.5 kb allele, while lanes 2-4 show
individuals homozygous for the 1.8 kb allele. The individual in lane 5
is heterozygous. All subjects contain a constant, and relatively weak,
band at 2.2 kb

Table 1 Distribution of Igβ Taq I RFLPs in various ethnic groups.
An asterisk indicates that Southern blot analysis of Taq I-digested
genomic DNA, using a full-length CD79b cDNA probe, identified two
polymorphic fragments of 2.5 kb and 1.8 kb
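
The banding pattern described above maps onto genotypes in a simple way, which can be sketched as follows. The band sizes (2.2, 1.8 and 2.5 kb) come from the text; the function name and the 0.1 kb sizing tolerance are assumptions introduced for illustration.

```python
CONSTANT_BAND_KB = 2.2           # present in all individuals
ALLELE_BANDS_KB = (1.8, 2.5)     # the two polymorphic Taq I fragments


def call_genotype(band_sizes_kb, tolerance_kb=0.1):
    """Call the Taq I RFLP genotype from estimated band sizes in one lane."""
    def present(target):
        return any(abs(b - target) <= tolerance_kb for b in band_sizes_kb)

    if not present(CONSTANT_BAND_KB):
        # Without the constant band the lane cannot be trusted.
        raise ValueError("constant 2.2 kb band not found")
    alleles = [a for a in ALLELE_BANDS_KB if present(a)]
    if not alleles:
        raise ValueError("no polymorphic band found")
    if len(alleles) == 2:
        return "1.8/2.5"                      # heterozygote
    return f"{alleles[0]}/{alleles[0]}"       # homozygote
```

For example, a lane with bands near 2.2 and 2.5 kb is called as a 2.5 kb homozygote, matching lane 1 of Figure 3 as described in its caption.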

We analyzed 60 unrelated individuals for the presence of 
the Taq I RFLP. Subjects from various ethnic groups were
examined, and Table 1 shows the results of this study
broken down by ethnic group. All three ethnic populations
appear to be in Hardy-Weinberg equilibrium (null hypothesis
not rejected by goodness-of-fit test).
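
A goodness-of-fit test of this kind can be sketched as a chi-square comparison of observed genotype counts with those expected from the estimated allele frequencies. The code below is a generic illustration, not the authors' actual calculation; the function name is hypothetical, and the counts in the usage line are invented placeholders rather than the values of Table 1.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_1DF_05 = 3.841


def hardy_weinberg_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Return (p, expected_counts, chi2) for a biallelic locus.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote; p is the estimated frequency of allele a.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no individuals supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Hypothetical counts, for illustration only (NOT the data of Table 1).
p_hat, expected_counts, chi2_stat = hardy_weinberg_chi_square(36, 48, 16)
in_equilibrium = chi2_stat < CHI2_CRITICAL_1DF_05
```

One degree of freedom is used because, with three genotype classes, one is lost to the fixed total and one to the estimated allele frequency. With small expected counts an exact test would be a safer choice than the chi-square approximation.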



Discussion 

In this study we determined the complete sequence of the
human CD79b (Igβ/B29) gene, and the sequence reveals a
remarkable degree of similarity between the mouse and
human homologues. These similarities relate both to sequence
conservation within the coding region and to
preservation of the overall exon/intron organization. The
gene organization is of some interest for understanding the
mechanisms by which variant forms of the CD79b molecule
may be produced. It has recently been reported that a
truncated form of mouse CD79b protein exists in which the
C-terminal portion of the cytoplasmic tail is deleted
(Friedrich et al. 1993). The exact extent of this deletion has
not been determined, nor has this been reported as yet in
human CD79b.

Nevertheless, the exon/intron structure of the gene
suggests that a possible mechanism for this deletion might
be alternative splicing at the 3' end of the gene, such that
exon 6 is removed from some transcripts. In addition, using
an RNase protection assay, we observed truncated transcripts
of the human CD79b in which a large segment of the
extracellular portion of the molecule had been deleted
(unpublished results). Interestingly, this deletion corresponds
exactly to the removal of exon 3, again suggesting
alternative splicing as a mechanism for this variant transcript.
The functional significance of these variant forms of
CD79b has not yet been established.

A second interesting feature of the Ig-associated sig- 
naling molecules is their distinct pattern of transcriptional 
regulation. The CD79b gene is transcriptionally active 
throughout all stages of B-cell differentiation (Hermanson 
et al. 1989). Thus, CD79b is expressed with approximately 
the same timing as the onset of Ig heavy chain rearrange- 
ment, an event associated with Ig gene transcription 
(Blackwell et al. 1986). Consistent with this pattern of 
expression, the CD79b gene contains regulatory elements 
commonly found in Ig enhancers. Some of these elements, 
such as the octamer motif, are identical in the mouse and
human CD79b sequences (see Figure 2), as well as in the Ig
genes (Staudt and Lenardo 1991). This sequence is known
to form a binding site for transcription factors Oct1 and
Oct2. Other putative enhancer elements also exhibit similarity
to the previously defined Ig enhancer sequences.
Among these are the E3-like and E2-like sequences present
in the human CD79b promoter region. It has recently been
shown that the E3-like sequence in mouse Igβ binds factors
present in nuclear extracts in gel shift assays (Omori and
Wall 1993). In addition to these 5' enhancer elements, we have
also identified candidate enhancer sequences in the 3' end of the gene,
similar to the location of such enhancers in Ig genes. Four
such regions are shown in Figure 1, each of which matches
the published Ig enhancer elements in 7/8 nucleotides.
Further studies will be required to determine whether these 
putative regulatory motifs are actually involved in the 
regulation of CD79b transcription. 

As described previously, the major function of the 
CD79b gene product is in the formation of a membrane 
signalling complex associated with Ig at the surface of 
B cells (Reth 1992). At a minimum, this complex consists 
of Ig itself associated in a noncovalent fashion with CD79b 
and a second polypeptide chain, CD79a (Igα/mb1). As we
have reported previously, the CD79b gene is transcribed at
high levels in plasma cells (Hashimoto et al. 1993). In
contrast, the CD79a gene is not transcribed in plasma cells
(Ha et al. 1992), consistent with the absence of surface Ig in
these cells. The CD79a gene contains a promoter region
which bears partial similarity to that seen in CD79b; both
the ets and Sp1 binding sites are present in the CD79a
promoter, which also lacks a TATA box. However, no
E3-like element is present in CD79a. In addition, a putative
cis-acting sequence has been recently identified in CD79a
which is the target for a novel DNA binding protein, BLyF
(Feldhaus et al. 1992). BLyF is not present in plasma cells,
and this may partially account for the lack of transcription
of CD79a in plasma cells. Interestingly, a BLyF binding
motif is not present in the CD79b 5' promoter region, again
emphasizing the distinct patterns of regulation for these two
genes. Given the absence of membrane Ig in plasma cells, it 
is at present unclear what function CD79b has at this stage 
of B-cell differentiation. 

Wood and co-workers (1993) have recently shown that 
the human CD79b gene maps specifically to 17q23. This is 
of interest in view of the fact that translocations involving 
17q23 have been observed in some cases of lymphocytic
leukemia (Mitelman et al. 1990). It is tempting to speculate
that the presence of numerous enhancer elements for B-cell 
transcription within the CD79b gene might enhance the 
expression of oncogenes translocated into this region. 
However, to our knowledge, a detailed molecular analysis 
of these translocations has not been performed. 

Finally, we identified a new Taq I polymorphism of the
CD79b gene. As shown in Table 1, a preliminary analysis of
gene frequencies within various ethnic groups shows a
relative preponderance of the 1.8 kb Taq I allele with an
overall heterozygosity (H) of 0.33; the distribution of
genotypes is consistent with Hardy-Weinberg equilibrium
in all three ethnic groups. We did not detect any coding
region polymorphisms associated with this Taq I RFLP,
although we have not yet searched for other allelic polymorphisms
in the flanking regulatory regions.
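
The relationship between heterozygosity and allele frequency at a biallelic locus can be made explicit. The sketch below assumes that H denotes the expected heterozygosity under Hardy-Weinberg proportions, H = 1 - Σp_i², which for two alleles reduces to H = 2pq; the text does not state how H was computed, so this is an assumption, and the function names are hypothetical.

```python
import math


def expected_heterozygosity(allele_freqs):
    """H = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if not math.isclose(sum(allele_freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def major_allele_frequency(h: float) -> float:
    """Invert H = 2p(1 - p) for a biallelic locus, returning the larger root."""
    if not 0.0 <= h <= 0.5:
        raise ValueError("biallelic heterozygosity must lie in [0, 0.5]")
    return (1.0 + math.sqrt(1.0 - 2.0 * h)) / 2.0


# Under the stated assumption, H = 0.33 implies a major-allele frequency
# of roughly 0.79, in line with the reported preponderance of one allele.
p_major = major_allele_frequency(0.33)
```

If H were instead the observed fraction of heterozygous individuals, the allele frequencies would have to be taken from the genotype counts in Table 1 rather than derived in this way.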